Learning Word Sense With Feature Selection and Order Identification Capabilities

نویسندگان

  • Zheng-Yu Niu
  • Dong-Hong Ji
  • Chew Lim Tan
چکیده

This paper presents an unsupervised word sense learning algorithm, which induces senses of target word by grouping its occurrences into a “natural” number of clusters based on the similarity of their contexts. For removing noisy words in feature set, feature selection is conducted by optimizing a cluster validation criterion subject to some constraint in an unsupervised manner. Gaussian mixture model and Minimum Description Length criterion are used to estimate cluster structure and cluster number. Experimental results show that our algorithm can find important feature subset, estimate model order (cluster number) and achieve better performance than another algorithm which requires cluster number to be provided.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Theme: A Study of Classifier Combination and Semi-Supervised Learning for Word Sense Disambiguation

1. Aims Word Sense Disambiguation (WSD) involves the association of a polysemous word in a text or discourse with a particular sense among numerous potential senses of that word. In my thesis, we present a study of classifier combination and semi-supervised learning for WSD, which aim to boost supervised WSD and improve accuracy of WSD. In addition, we also work on context representation and fe...

متن کامل

Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method

Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...

متن کامل

Disambiguation with Feature Selection and Semi - Supervised Learning ”

1. Objective Word Sense Disambiguation (WSD) is the task of determining the right sense of a polysemous word in a given context. This study aims to enhance the performance of supervised-based word sense determination by focusing on feature selection and using bootstrapping techniques. Senses determination of a word is essentially based on the information extracted from the context in which this...

متن کامل

Word sense disambiguation with pattern learning and automatic feature selection

This paper presents a novel approach for word sense disambiguation. The underlying algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor), from dictionary definitions (WordNet) and from a generated corpus (GenCor); and (2) instance based learning with automatic feature selection, when training data is available for a particular word. The ideas descr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004